An Improved Proximal Policy Optimization Method for Low-Level Control of a Quadrotor
Abstract
In this paper, a novel deep reinforcement learning algorithm based on Proximal Policy Optimization (PPO) is proposed to achieve fixed-point flight control of a quadrotor. The attitude and position information of the quadrotor is directly mapped to the PWM signals of the four rotors through neural network control. To constrain the size of policy updates, a PPO algorithm based on Monte Carlo approximations is proposed to achieve the optimal penalty coefficient. A policy optimization method with a penalized point probability distance can provide diversity of the policy by performing each policy update. The new proxy objective function is introduced into the actor–critic network, which solves the problem of PPO falling into local optimization. Moreover, a compound reward function is presented to accelerate the gradient algorithm along the policy update direction by analyzing various states that the quadrotor may encounter in flight, which improves the learning efficiency of the network. The simulation tests the generalization ability of the offline policy by changing the wing length and payload of the quadrotor. Compared with the PPO method, the proposed method has higher learning efficiency and better robustness.
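The penalized probability-distance idea in the abstract can be illustrated with a minimal sketch. This is not the paper's objective: the squared gap between old and new action probabilities used as the penalty, the function name `penalized_surrogate`, and the coefficient `beta` are all assumptions chosen for illustration.

```python
import math

def penalized_surrogate(logp_new, logp_old, advantages, beta=5.0):
    """Hypothetical penalized surrogate (an assumption, not the paper's formula).

    For each sample, the probability ratio times the advantage is reduced by
    beta times the squared gap between the old and new action probabilities.
    Returns the mean over all samples.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(logp_new, logp_old, advantages):
        p_new, p_old = math.exp(lp_new), math.exp(lp_old)
        ratio = p_new / p_old
        penalty = (p_old - p_new) ** 2  # assumed distance between the two policies at this sample
        total += ratio * adv - beta * penalty
    return total / len(advantages)
```

When the new policy equals the old one, the penalty vanishes and the value reduces to the mean advantage; the further the probabilities drift apart, the more the update is discouraged.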
Similar resources
Development and Implementation of an Optimized Control Strategy for Induction Machine in an Electric Vehicle
In the area of automotive engineering there is a tendency toward greater electrification of the powertrain. In this work, control of an induction machine for electric vehicle applications is investigated. As the operating point of the machine changes, adapting the rotor magnetization current appears useful for increasing the machine's efficiency. In the literature there are many approaches wh...
Robust Control of a Quadrotor
In this paper, a robust tracking control method for automatic take-off and trajectory tracking of a quadrotor helicopter is presented. The designed controller includes two parts: a position controller and an attitude controller. The attitude controller is designed by using the sliding mode control (SMC) method to track the desired pitch and roll angles, which are the output of position controll...
Solution of Security Constrained Unit Commitment Problem by a New Multi-Objective Optimization Method
Abstract: Optimal power flow, as one of the foundational tools for analyzing complex power systems, has been studied for a long time. Optimal power flow optimizes the objective functions of a power system, such as fuel cost, emissions, and losses, while at the same time satisfying the power system's constraints. In its most general form, OPF is a nonlinear, non-convex, large-scale, static optimization problem that can include control variables that are continuous and ...
Improved Optimization Process for Nonlinear Model Predictive Control of PMSM
Model-based predictive control (MPC) is one of the most efficient techniques and is widely used in industrial applications. In such controllers, increasing the prediction horizon results in better selection of the optimal control signal sequence. On the other hand, increasing the prediction horizon increases the computational time of the optimization process, which makes it impossible to be imple...
Proximal Policy Optimization Algorithms
We propose a new family of policy gradient methods for reinforcement learning, which alternate between sampling data through interaction with the environment, and optimizing a “surrogate” objective function using stochastic gradient ascent. Whereas standard policy gradient methods perform one gradient update per data sample, we propose a novel objective function that enables multiple epochs of ...
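The truncated abstract above does not spell out the objective itself. As a hedged illustration, the widely cited clipped form of the PPO surrogate can be sketched as follows; the function name `clipped_surrogate`, the plain-list inputs, and the default `eps=0.2` are assumptions made for this example.

```python
def clipped_surrogate(ratios, advantages, eps=0.2):
    """Mean clipped surrogate: min(r * A, clip(r, 1 - eps, 1 + eps) * A).

    ratios:     new-to-old action probability ratios, one per sample
    advantages: advantage estimates, one per sample
    eps:        clipping range (illustrative default)
    """
    total = 0.0
    for r, adv in zip(ratios, advantages):
        clipped = min(max(r, 1.0 - eps), 1.0 + eps)
        # Taking the minimum removes any gain from pushing r outside the clip range.
        total += min(r * adv, clipped * adv)
    return total / len(ratios)
```

Because the objective stops rewarding ratio changes beyond the clip range, the same batch of samples can be reused for several gradient steps without the policy moving too far from the one that collected the data.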
Journal
Journal title: Actuators
Year: 2022
ISSN: 2076-0825
DOI: https://doi.org/10.3390/act11040105